Ward's Hierarchical Agglomerative Clustering Method: Which Algorithms Implement Ward's Criterion?
Authors
Abstract
The Ward error sum of squares hierarchical clustering method has been very widely used since its first description by Ward in a 1963 publication. It has also been generalized in various ways. Two algorithms are found in the literature and software, both announcing that they implement the Ward clustering method. When applied to the same distance matrix, they produce different results. One algorithm preserves Ward’s criterion, the other does not. Our survey work and case studies will be useful for all those involved in developing software for data analysis using Ward’s hierarchical clustering method.
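The discrepancy described above can be illustrated with a small sketch. It assumes, and the abstract itself does not say this, that the two algorithms differ in whether the Lance-Williams update for Ward's method is applied to squared dissimilarities or directly to the dissimilarities as given. The function name `ward_merge_heights` and the flag `square_first` are hypothetical, chosen only for this illustration. The check at the end uses the standard identity that, for squared Euclidean input, the merge levels sum to twice the total sum of squares about the mean.

```python
import numpy as np


def ward_merge_heights(dist, square_first):
    """Agglomerate with the Ward Lance-Williams update and return merge heights.

    square_first=True : square the input, update squared values, and report
                        square roots of the merge levels.
    square_first=False: apply the same update to the input exactly as given.
    (Hypothetical helper for illustration, not taken from the paper.)
    """
    d = np.array(dist, dtype=float)
    if square_first:
        d = d ** 2
    sizes = {i: 1 for i in range(d.shape[0])}  # active cluster label -> size
    heights = []
    while len(sizes) > 1:
        active = sorted(sizes)
        # Find the closest pair of active clusters.
        h, i, j = min(
            (d[i, j], i, j)
            for a, i in enumerate(active)
            for j in active[a + 1:]
        )
        heights.append(np.sqrt(h) if square_first else h)
        ni, nj = sizes[i], sizes[j]
        # Lance-Williams update for Ward: dissimilarity from the merged
        # cluster (kept under label i) to every other active cluster k.
        for k in active:
            if k in (i, j):
                continue
            nk = sizes[k]
            new = ((ni + nk) * d[i, k] + (nj + nk) * d[j, k] - nk * d[i, j]) / (
                ni + nj + nk
            )
            d[i, k] = d[k, i] = new
        sizes[i] = ni + nj
        del sizes[j]
    return np.array(heights)


# Four points on a line; the same Euclidean distance matrix feeds both variants.
points = np.array([[0.0], [1.0], [3.0], [7.0]])
euclid = np.abs(points - points.T)

heights_squared = ward_merge_heights(euclid, square_first=True)
heights_direct = ward_merge_heights(euclid, square_first=False)

# Total sum of squares about the mean, the quantity Ward's criterion partitions.
total_ss = ((points - points.mean(axis=0)) ** 2).sum()

print("squared-update heights:", heights_squared)
print("direct-update heights: ", heights_direct)
print("sum(h^2)/2 vs total SS:", (heights_squared ** 2).sum() / 2, total_ss)
```

In this toy case the two variants agree on the first merge and then diverge, and only the squared-update variant reproduces the total sum of squares from its merge levels. Whether this matches the exact pair of algorithms the paper compares is an assumption of the sketch.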
Similar resources
Interpreting and Extending Classical Agglomerative Clustering Algorithms using a Model-Based approach
We present two results which arise from a model-based approach to hierarchical agglomerative clustering. First, we show formally that the common heuristic agglomerative clustering algorithms – Ward’s method, single-link, complete-link, and a variant of group-average – are each equivalent to a hierarchical model-based method. This interpretation gives a theoretical explanation of the empirical b...
Evaluation of Agglomerative Hierarchical Clustering Methods
This paper describes the findings from evaluating the performance of agglomerative hierarchical cluster methods for determining seasonal factor groups. Seasonal factor groups are usually determined by traditional cluster analysis based on various similarity measures. Agglomerative hierarchical methods merge telemetry traffic monitoring sites (TTMSs) into groups according to their similarities. ...
Methods for detecting functional classifications in neuroimaging data.
Data-driven statistical methods are useful for examining the spatial organization of human brain function. Cluster analysis is one approach that aims to identify spatial classifications of temporal brain activity profiles. Numerous clustering algorithms are available, and no one method is optimal for all areas of application because an algorithm's performance depends on specific characteristics...
Generalising Ward’s Method for Use with Manhattan Distances
The claim that Ward's linkage algorithm in hierarchical clustering is limited to use with Euclidean distances is investigated. In this paper, Ward's clustering algorithm is generalised to use with l1 norm or Manhattan distances. We argue that the generalisation of Ward's linkage method to incorporate Manhattan distances is theoretically sound and provide an example of where this method outperfo...
Efficient Agglomerative clustering Method for Micro Array Data on Breast Cancer Outcome
Analysis of micro arrays presents a number of unique challenges for data mining. The main types of data analysis needed for biomedical applications include clustering: finding new biological classes or refining an existing one. We compare the various experimental clustering results of S+ from Insightful, XCluster at Stanford, Eisen’s Cluster, and Rousseau & Kaufman’s Web clusters for single linkag...
Journal title:
- J. Classification
Volume: 31
Publication date: 2014